Automatic Translation Template Acquisition Based on Bilingual Structure Alignment

Authors

  • Yajuan Lü
  • Ming Zhou
  • Sheng Li
  • Changning Huang
  • Tiejun Zhao
Abstract

Knowledge acquisition is a bottleneck in machine translation and many NLP tasks. This paper proposes a method for automatically acquiring translation templates from bilingual corpora. Bilingual sentence pairs are first aligned at the level of syntactic structure by combining language parsing with a statistical bilingual language model. The alignment results are then used to extract translation templates, which turn out to be very useful in real machine translation.


Related resources

Approach to Automatic Translation Template Acquisition Based on Unannotated Bilingual Grammar Induction

In this paper, we propose a new approach that automatically acquires translation templates from unannotated bilingual spoken-language corpora in the travel information domain. The approach adopts two basic algorithms: a grammar induction algorithm and a dynamic programming algorithm. Our approach is an unsupervised, statistical, data-driven method which avoids the...


Automatic Spoken Language Translation Template Acquisition Based on Boosting Structure Extraction and Alignment

In this paper, we propose a new approach for acquiring translation templates automatically from unannotated bilingual spoken language corpora. Two basic algorithms are adopted: a grammar induction algorithm, and an alignment algorithm using Bracketing Transduction Grammar. The approach is unsupervised, statistical, data-driven, and employs no parsing procedure. The acquisition procedure consist...


Learning Translation Templates From Bilingual Text

This paper proposes a two-phase example-based machine translation methodology which develops translation templates from examples and then translates using template matching. This method improves translation quality and facilitates customization of machine translation systems. This paper focuses on the automatic learning of translation templates. A translation template is a bilingual pair of sen...


Data-driven Amharic-English Bilingual Lexicon Acquisition

This paper describes a simple approach of statistical language modelling for bilingual lexicon acquisition from Amharic-English parallel corpora. The goal is to induce a seed translation lexicon from sentence-aligned corpora. The seed translation lexicon contains matches of Amharic lexemes to weakly inflected English words. Purely statistical measures of term distribution are used as the basis ...


Automatic Construction of Translation Knowledge for Corpus-based Machine Translation

Many machine translation (MT) systems that utilize the knowledge automatically acquired from bilingual corpora have been proposed in conjunction with efforts to accumulate corpora. We call this approach corpus-based machine translation in this thesis. This thesis focuses on automatic construction of the translation knowledge needed for corpus-based MT and discusses the following three tasks. 1....



Journal:
  • IJCLCLP

Volume 6

Publication date: 2001